

Search for: All records

Creators/Authors contains: "Zhang, Nan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

  1. Free, publicly-accessible full text available October 19, 2026
  2. Abstract: Intuitively, the more complex a software system is, the harder it is to maintain. Statistically, it is not clear which complexity metrics correlate with maintenance effort; in fact, it is not even clear how to objectively measure maintenance burden, so that developers’ sentiment and intuition can be supported by numbers. Without effective complexity and maintenance metrics, it remains difficult to objectively monitor maintenance, control complexity, or justify refactoring. In this paper, we report a large-scale study of 1252 projects written in C++ and Java from Google LLC. We collected three categories of metrics: (1) architectural complexity, measured using propagation cost (PC), decoupling level (DL), and structural anti-patterns; (2) maintenance activity, measured using the number of changes, lines of code (LOC) written, and active coding time (ACT) spent on feature-addition vs. bug-fixing; and (3) developer sentiment on complexity and productivity, collected from 7200 survey responses. We statistically analyzed the correlations among these metrics and obtained significant evidence of the following findings: (1) the more complex the architecture is (higher propagation cost, more instances of anti-patterns), the more LOC is spent on bug-fixing rather than on adding new features; (2) developers who commit more changes for features, spend more lines of code on features, or spend more time on features also feel that they are less hindered by technical debt and complexity. To the best of our knowledge, this is the first large-scale empirical study establishing the statistical correlation among architectural complexity, maintenance activity, and developer sentiment. 
The implication is that, instead of solely relying upon developer sentiment and intuition to detect degraded structure or increased burden to evolve, it is possible to objectively and continuously measure and monitor architectural complexity and maintenance difficulty, increasing feature delivery efficiency by reducing architectural complexity and anti-patterns. 
    Free, publicly-accessible full text available April 28, 2026
  3. Abstract: Intuitively, the more complex a software system is, the harder it is to maintain. Statistically, it is not clear which complexity metrics correlate with maintenance effort; in fact, it is not even clear how to objectively measure maintenance burden, so that developers’ sentiment and intuition can be supported by numbers. Without effective complexity and maintenance metrics, it remains difficult to objectively monitor maintenance, control complexity, or justify refactoring. In this paper, we report a large-scale study of 1252 projects written in C++ and Java from Google LLC. We collected three categories of metrics: (1) architectural complexity, measured using propagation cost (PC), decoupling level (DL), and structural anti-patterns; (2) maintenance activity, measured using the number of changes, lines of code (LOC) written, and active coding time (ACT) spent on feature-addition vs. bug-fixing; and (3) developer sentiment on complexity and productivity, collected from 7200 survey responses. We statistically analyzed the correlations among these metrics and obtained significant evidence of the following findings: (1) the more complex the architecture is (higher propagation cost, more instances of anti-patterns), the more LOC is spent on bug-fixing rather than on adding new features; (2) developers who commit more changes for features, spend more lines of code on features, or spend more time on features also feel that they are less hindered by technical debt and complexity. To the best of our knowledge, this is the first large-scale empirical study establishing the statistical correlation among architectural complexity, maintenance activity, and developer sentiment. 
The implication is that, instead of solely relying upon developer sentiment and intuition to detect degraded structure or increased burden to evolve, it is possible to objectively and continuously measure and monitor architectural complexity and maintenance difficulty, increasing feature delivery efficiency by reducing architectural complexity and anti-patterns. 
    Free, publicly-accessible full text available April 30, 2026
  4. Free, publicly-accessible full text available March 12, 2026
  5. Free, publicly-accessible full text available January 1, 2026
  6. Objective: Recurrent respiratory papillomatosis (RRP) is a rare disease of the airway for which there is no known cure. Treatment involves the surgical removal or destruction of these lesions. There has been a long-standing debate over the effectiveness of the adjuvant intralesional injection of the immune modifying agent bevacizumab. This study is a systematic review investigating the effect of adjuvant intralesional bevacizumab on patients with laryngeal papillomatosis. The main objective was to assess functional outcomes and efficacy. Data Sources: PubMed, Google Scholar, and Web of Science. Review Methods: Search words were “intralesional bevacizumab” AND “recurrent respiratory papillomatosis.” Sources were systematically identified using inclusion and exclusion criteria (i.e., the study must post-date 2000, be peer-reviewed, investigate patients with RRP, and apply bevacizumab intralesionally, not systemically). Findings were then collected and analyzed. Results: Ten studies were included for analysis. The majority of these studies found a longer surgical interval, improved voice outcomes, and a decrease in tumor burden in most patients. No studies reported side effects or lasting complications related to the bevacizumab injection. Conclusion: This systematic review provides further evidence for the safety of intralesional bevacizumab injections and their likely positive effect on disease control. Future research would benefit from the implementation of standardized documentation of RRP outcomes. 
  7. Free, publicly-accessible full text available January 1, 2026
  8. Boolean matrix factorization (BMF) has been widely utilized in fields such as recommendation systems, graph learning, text mining, and -omics data analysis. Traditional BMF methods decompose a binary matrix into the Boolean product of two lower-rank Boolean matrices plus homoscedastic random errors. However, real-world binary data typically involves biases arising from heterogeneous row- and column-wise signal distributions. Such biases can lead to suboptimal fitting and unexplainable predictions if not accounted for. In this study, we reconceptualize the binary data generation as the Boolean sum of three components: a binary pattern matrix, a background bias matrix influenced by heterogeneous row or column distributions, and random flipping errors. We introduce a novel Disentangled Representation Learning for Binary matrices (DRLB) method, which employs a dual auto-encoder network to reveal the true patterns. DRLB can be seamlessly integrated with existing BMF techniques to facilitate bias-aware BMF. Our experiments with both synthetic and real-world datasets show that DRLB significantly enhances the precision of traditional BMF methods while offering high scalability. Moreover, the bias matrix detected by DRLB accurately reflects the inherent biases in synthetic data, and the patterns identified in the bias-corrected real-world data exhibit enhanced interpretability. 
  9. Biofilms are clusters of microorganisms that form at various interfaces, including those between air and liquid or liquid and solid. Due to their roles in enhancing wastewater treatment processes, and their unfortunate propensity to cause persistent human infections through lowering antibiotic susceptibility, understanding and managing bacterial biofilms is of paramount importance. A pivotal stage in biofilm development is the initial bacterial attachment to these interfaces. However, the determinants of which bacterial cells colonize an interface first, and of the heterogeneity in bacterial adhesion, remain elusive. Our research has unveiled variations in the buoyant density of free-swimming Staphylococcus aureus cells, irrespective of their growth phase. Cells with a low cell buoyant density, characterized by fewer cell contents, exhibited lower susceptibility to antibiotic treatments (100 μg/mL vancomycin) and favored biofilm formation at air–liquid interfaces. In contrast, cells with higher cell buoyant density, which have richer cell contents, were more vulnerable to antibiotics and predominantly formed biofilms on liquid–solid interfaces when contained upright. Cells with low cell buoyant density were not able to revert to a more antibiotic-sensitive, high cell buoyant density phenotype. In essence, S. aureus cells with higher cell buoyant density may be more inclined to adhere to upright substrates. 
  10. Boolean matrix factorization (BMF) has been widely utilized in fields such as recommendation systems, graph learning, text mining, and -omics data analysis. Traditional BMF methods decompose a binary matrix into the Boolean product of two lower-rank Boolean matrices plus homoscedastic random errors. However, real-world binary data typically involves biases arising from heterogeneous row- and column-wise signal distributions. Such biases can lead to suboptimal fitting and unexplainable predictions if not accounted for. In this study, we reconceptualize the binary data generation as the Boolean sum of three components: a binary pattern matrix, a background bias matrix influenced by heterogeneous row or column distributions, and random flipping errors. We introduce a novel Disentangled Representation Learning for Binary matrices (DRLB) method, which employs a dual auto-encoder network to reveal the true patterns. DRLB can be seamlessly integrated with existing BMF techniques to facilitate bias-aware BMF. Our experiments with both synthetic and real-world datasets show that DRLB significantly enhances the precision of traditional BMF methods while offering high scalability. Moreover, the bias matrix detected by DRLB accurately reflects the inherent biases in synthetic data, and the patterns identified in the bias-corrected real-world data exhibit enhanced interpretability. 
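
As a rough illustration of the terms used in the Boolean matrix factorization abstract above, the following minimal sketch computes a Boolean product of two low-rank factors and then overlays a background bias with a Boolean sum (element-wise OR). It is not the DRLB method, and it omits the random flipping errors; the matrices `left`, `right`, and `bias` and the helper `boolean_product` are hypothetical names and values chosen only for this example.

```python
import numpy as np


def boolean_product(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Boolean matrix product: out[i, j] = OR over k of (a[i, k] AND b[k, j])."""
    # The integer product counts the k for which both entries are 1;
    # any positive count means the OR is true.
    return (a.astype(np.int64) @ b.astype(np.int64)) > 0


# Hypothetical rank-2 Boolean factors (assumed values, not from the paper).
left = np.array([[1, 0],
                 [1, 1],
                 [0, 1]], dtype=bool)          # 3 x 2
right = np.array([[1, 1, 0, 0],
                  [0, 1, 1, 0]], dtype=bool)   # 2 x 4

# Binary pattern matrix produced by the two factors.
pattern = boolean_product(left, right)

# Hypothetical row-wise background bias: row 2 is dense regardless of pattern.
bias = np.zeros_like(pattern)
bias[2, :] = True

# Boolean sum (element-wise OR) of the pattern and the bias.
observed = pattern | bias

# Entries that a bias-unaware rank-2 fit of `pattern` would get wrong.
mismatch = int(np.sum(observed ^ pattern))
print(pattern.astype(int))
print(observed.astype(int))
print("entries explained only by bias:", mismatch)
```

In this toy case the biased row adds ones that no rank-2 pattern accounts for, which is the kind of effect the abstract says can lead to suboptimal fitting when bias is not modeled separately.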